Abstract: This paper investigates the Diversity-Multiplexing
gain Trade-off (DMT) of a training based reciprocal Single Input 
Multiple Output (SIMO) system, with (i) perfect Channel State 
Information (CSI) at the Receiver (CSIR) and noisy CSI at the 
Transmitter (CSIT), and (ii) noisy CSIR and noisy CSIT. In 
both the cases, the CSIT is acquired through Reverse Channel 
Training (RCT), i.e., by sending a training sequence from the 
receiver to the transmitter. A channel-dependent fixed-power 
training scheme is proposed for acquiring CSIT, along with a 
forward-link data transmit power control scheme. With perfect 
CSIR, the proposed scheme is shown to achieve a diversity 
order that is quadratically increasing with the number of receive 
antennas. This is in contrast with conventional orthogonal RCT 
schemes, where the diversity order is known to saturate as the 
number of receive antennas is increased, for a given channel 
coherence time. Moreover, the proposed scheme can achieve a 
larger DMT compared to the orthogonal training scheme. With 
noisy CSIR and noisy CSIT, a three-way training scheme is 
proposed and its DMT performance is analyzed. It is shown 
that nearly the same diversity order is achievable as in the 
perfect CSIR case. The time-overhead in the training schemes is 
explicitly accounted for in this work, and the results show that 
the proposed channel-dependent RCT and data power control 
schemes offer a significant improvement in terms of the DMT, 
compared to channel-agnostic orthogonal RCT schemes. The 
outage performance of the proposed scheme is illustrated through 
Monte-Carlo simulations. 

Index Terms — Diversity-multiplexing gain tradeoff, MMSE 
channel estimation, training sequence. 



I. Introduction 

Reliability and system throughput are two fundamental 
parameters of interest in any wireless communication system, 
and the inherent tradeoff between the two at high SNR was 
elegantly captured by the Diversity Multiplexing gain Tradeoff 
(DMT) proposed in the seminal work of Zheng and Tse [2].
It is known that a significant improvement in the outage 
performance can be obtained if the Channel State Information 
(CSI) at the receiver (CSIR) and the transmitter (CSIT) are 
perfect [S), ID, while [2] considered perfect CSIR and no
CSIT. 

In a Time Division Duplex (TDD) system, CSI could be 
estimated at the transmitter and receiver by sending a known 
training sequence in the forward and reverse-link directions, 
respectively. This has two consequences. First, the estimation 
error results in incorrect data rate or power adaptation at
the transmitter, in turn leading to higher outage rate. Second, 
training incurs a time overhead, which could be non-trivial 
when the training occupies a significant fraction of the channel 
coherence time, as it affects the pre-log term in the achievable 
data rate 0. This paper therefore focuses on the important 
problem of analytically comparing the DMT performance 
of different channel estimation techniques and identifying 
training signals and data power control schemes that result 
in a good performance in terms of the achievable DMT. We 
start with a brief survey of related literature. 

The impact of imperfect CSIT on the DMT of a multiple 
antenna system has been a popular area of research, and it is 
known that even with imperfect CSIR and CSIT, a significant 
improvement in DMT can be obtained, compared to the no- 
CSIT case (see, for example, 0-1181). The effect of imperfect 
CSIR on the DMT of a MIMO system was first studied in 
[9]. The DMT analysis of a multiple antenna system with
perfect CSIR, where the CSIT is modeled as the CSI plus
Gaussian noise whose variance decreases with training SNR,
was investigated in [10]-[12]. In a TDD setup, the achievable
DMT improvement using power control based on noisy CSIT
was shown in [12]-[14]. Other works that study the DMT
performance with quantized feedback of CSI and/or target 
data rate control based on noisy CSIT include JS), Q, ifTTI . 
(HSl-lIISl. In ini, im, the DMT of two-way and multi- 
round training schemes in a TDD system was derived. In 
these studies, the channel feedback signal on the reverse link 
is chosen to satisfy an average power constraint, rather than 
an instantaneous power constraint. 

Most of the aforementioned studies of the DMT with 
imperfect CSI typically ignore the training duration overhead. 
Hence, they are primarily applicable to slowly varying chan- 
nels, where the time overhead in training occupies an insignif- 
icant fraction of the channel coherence time. An exception is 
[13], where, taking the training overhead into account, the
authors concluded that for nonzero multiplexing gain $g_m$, the
diversity order saturates as r increases, where r is the number 
of receive antennas. Hence, for fast varying channels, the 
authors suggest turning off receive antennas in order to achieve 
higher multiplexing gains. It is important to account for the 
training duration overhead in deriving the achievable DMT, 
because, as the SNR goes to infinity, although the estimation 
error goes to zero, the training duration overhead remains fixed 
and has a direct impact on the DMT. Also, by modeling the 
CSIT as the sum of the true CSI and an additive error, most 
of the past studies implicitly assume that a channel-agnostic 
orthogonal training signal is employed for channel estimation. 
When the training signal is channel-dependent, the imperfect
CSI can no longer be modeled as the sum of the true CSI and 
an additive noise. Due to this, the existing results cannot be
directly extended to analyze the DMT performance of
channel-dependent training schemes.

When the channel is reciprocal and block-fading, e.g., in a 
TDD system, the receiver could exploit its channel knowledge 
(acquired through an initial forward-link training phase) in 
designing its reverse-training sequence, not only to reduce the 
channel estimation error at the transmitter, but also to reduce 
the required training duration overhead. Hence, the goals of 
this paper are two-fold: (a) to analyze the DMT performance of 
a channel dependent training scheme for acquiring CSIT and 
an associated power control mechanism for data transmission; 
and (b) to contrast the DMT performance of the proposed 
training and power control schemes with that achieved by 
conventional channel agnostic training schemes. Our study fo- 
cuses on point-to-point Single Input Multiple Output (SIMO) 
systems. This is of practical importance, since it applies, for 
example, to the uplink of wireless networks where the base 
station has multiple antennas, the mobile users have a single 
antenna, and orthogonal access is used (e.g., OFDM/TDMA) 
as in WLANs and 4G/LTE systems. The channel dependent 
training sequence employed here was first proposed by us in 
[20] and H] in a MIMO and SIMO context, respectively, and
was independently explored in [21], although not in a DMT
context. 

In this paper, for analytical simplicity and clarity of pre- 
sentation, we start by assuming that perfect CSI is available 
at the receiver, as in [10]-[12]. We propose a fixed-power
RCT sequence, using which, the CSI can be estimated at the 
transmitter using a minimum duration of only one symbol, i.e., 
with a factor of r reduction in training duration compared to 
orthogonal RCT. For data transmission, we propose a modified 
truncated channel inversion-type power control scheme based 
on the noisy CSIT. For this system, we show that a diversity 
of $d(g_m) = r\left(s + 1 - \frac{g_m L_c}{L_c - L_{B,\tau}}\right)$ is achievable. Here, $g_m$ is
the multiplexing gain, $L_c$ is the coherence time, $L_{B,\tau} \ge 1$ is
the reverse training duration, and $1 \le s \le r$ is a parameter in
the data power control scheme. (See Section III.)

Next, we consider the more practical case where noisy CSIR 
is acquired via a forward link training sequence, and propose a 
three-way training scheme followed by data transmission. We 
show that a DMT of $d(g_m) = r\left(s + 1 - \frac{g_m L_c}{L_c - \beta}\right)$ is achievable,
where $\beta \ge 3$ is the total training overhead from all three
training phases, which is again an improvement over conven- 
tional orthogonal training schemes. For example, a nonzero 
diversity order can be achieved with < < 

which is not possible with orthogonal training schemes without 
switching off receive antennas and incurring an associated 
reduction in diversity order. (See Section IV.)

Note that although the perfect CSIR case is a special case 
of the three-way training scheme with infinite forward-link 
training power, we briefly present the perfect CSIR case 
also, as it provides insights into the impact of the reverse- 
training and data power control mechanisms on the DMT. 
Moreover, it is useful as an upper bound on the performance 



with imperfect CSIR. Also, we assume that power control is 
employed only at the transmitter and focus on fixed-power 
RCT in the sequel. Using power controlled RCT significantly 
changes the problem; we analyze this case in our follow up 
work [22].

An important implication of our work is that it shows that 
by exploiting the receiver's knowledge of the CSI in designing 
the reverse channel training (RCT) sequence and using our 
proposed data power control scheme, one can achieve a higher 
diversity order than conventional RCT for all values of $g_m$.
Somewhat surprisingly, we also demonstrate that although the 
DMT analysis corresponds to taking the SNR to infinity, it 
can nonetheless be used to discriminate between different 
training schemes both in terms of the estimation error as well 
as the training overhead. At finite SNR, this translates to an 
improvement in the outage probability performance and the 
achievable data rate, as will be illustrated through Monte-Carlo 
simulations in Section VI.

We use the following notation. Boldface letters are used
for vectors and normal font letters are used for scalars.
We use $\mathbb{E}(\cdot)$ to denote the expected value of $(\cdot)$. We use
$\|\mathbf{h}\|_2$ to represent the $\ell_2$ norm of $\mathbf{h}$. The conjugate
transpose, absolute value, and real part are denoted by $(\cdot)^H$,
$|\cdot|$ and $\Re\{\cdot\}$, respectively. We write $f(P) \doteq P^{-k}$ to mean
$-\lim_{P \to \infty} \frac{\log f(P)}{\log P} = k$. Similarly, we define $f(P) \,\dot{\le}\, P^{-k}$ to
mean $-\lim_{P \to \infty} \frac{\log f(P)}{\log P} \ge k$.

II. System Model 

The system model consists of two communicating nodes,
node A with a single antenna and node B with $r$ antennas,
with node A attempting to send data to node B over a
wireless channel. The forward channel from node A to node
B, denoted by $\mathbf{h} \in \mathbb{C}^{r \times 1}$, is modeled as a Rayleigh flat fading
channel whose entries are i.i.d. Circularly Symmetric Complex
Gaussian (CSCG) random variables with zero mean and unit
variance, i.e., $\mathcal{CN}(0, 1)$. The channel is assumed to be
block-fading, i.e., it remains constant for the duration of the coherence
time $L_c$, and evolves in an i.i.d. fashion across coherence times.
We assume a TDD system with perfect reciprocity, and hence,
taking the complex conjugate of the received signal at node
A, the reverse link channel is $\mathbf{h}^H$. We let $\mathbf{h} = \sigma \mathbf{v}$, where
$\sigma = \|\mathbf{h}\|_2$ is the singular value and $\mathbf{v} = \mathbf{h}/\|\mathbf{h}\|_2$ is the singular
vector of $\mathbf{h}$. Since our goal is to study the achievable DMT
performance with channel training, we first explain the
two-way training protocol used for acquiring CSI at node B and
node A. Later, in Sec. IV, an additional phase of forward
link training is introduced, which is not presented here for
simplicity.

1) Phase I (Forward-link training): Here, the training
sequence $x_{A,\tau} = \sqrt{P L_{A,\tau_1}}$ is transmitted from node A to
node B, where $L_{A,\tau_1}$ denotes the training duration and $P$
is the training power.^1 Throughout this paper, we use $P$ as
the average power constraint during both training and data
transmission. The corresponding received training signal is
given by

$$\mathbf{y}_{B,\tau} = \sqrt{P L_{A,\tau_1}}\, \mathbf{h} + \mathbf{w}_{B,\tau}. \tag{1}$$

The entries of $\mathbf{w}_{B,\tau} \in \mathbb{C}^{r \times 1}$ are assumed to be distributed
as i.i.d. $\mathcal{CN}(0, 1)$. From the received training signal $\mathbf{y}_{B,\tau}$,
node B computes an MMSE estimate of $\mathbf{h}$, denoted $\hat{\mathbf{h}}$.
The error in the estimate, denoted $\tilde{\mathbf{h}} = \mathbf{h} - \hat{\mathbf{h}}$, has i.i.d.
$\mathcal{CN}(0, 1/(1 + P L_{A,\tau_1}))$ distributed entries.

^1 Strictly speaking, $x_{A,\tau} = \sqrt{P}$ is transmitted repeatedly $L_{A,\tau_1}$ times.
Mathematically, this is equivalent to using $x_{A,\tau} = \sqrt{P L_{A,\tau_1}}$ for a duration
of one unit.
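The error variance quoted for Phase I can be checked numerically. The following is a minimal Python sketch, assuming the linear MMSE estimator for a unit-variance Gaussian channel observed in unit-variance Gaussian noise; the function names (`cscg`, `mmse_estimate`, `empirical_error_variance`) are illustrative and do not come from the paper.

```python
import math
import random

def cscg(rng):
    """Draw one CN(0, 1) sample (circularly symmetric complex Gaussian)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def mmse_estimate(y, P, L_train):
    """MMSE estimate of h from y = sqrt(P*L_train)*h + w, with h, w ~ CN(0, I)."""
    gain = math.sqrt(P * L_train)
    # For jointly Gaussian h and y, the MMSE estimate is a scaled observation.
    return [gain / (1.0 + P * L_train) * yi for yi in y]

def empirical_error_variance(r=4, P=10.0, L_train=1, trials=20000, seed=0):
    """Average per-entry squared estimation error over many channel draws."""
    rng = random.Random(seed)
    gain = math.sqrt(P * L_train)
    total = 0.0
    for _ in range(trials):
        h = [cscg(rng) for _ in range(r)]
        y = [gain * hi + cscg(rng) for hi in h]
        h_hat = mmse_estimate(y, P, L_train)
        total += sum(abs(hi - ei) ** 2 for hi, ei in zip(h, h_hat)) / r
    return total / trials

if __name__ == "__main__":
    # The two printed numbers should be close: empirical vs. 1/(1 + P*L_train).
    print(empirical_error_variance(), 1.0 / (1.0 + 10.0))
```

In this sketch the empirical value should approach $1/(1 + P L_{A,\tau_1})$ as the number of trials grows, which matches the variance stated for the entries of the estimation error.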

In a TDD-SIMO system, node A only requires knowledge
of $\sigma$ to perform power control, which in turn improves the
diversity order compared to the no-CSIT case. Therefore, in
phase II, we estimate only $\sigma$ at node A, using a
channel-dependent training sequence.

2) Phase II (Reverse-link training): Since node B has an
estimate (say, $\hat{\mathbf{v}} = \hat{\mathbf{h}}/\|\hat{\mathbf{h}}\|_2$) of the channel, in this phase, it
exploits its CSI to transmit the following training sequence [20]:

$$\mathbf{x}_B = \sqrt{P L_{B,\tau}}\, \hat{\mathbf{v}}, \tag{2}$$

where $L_{B,\tau}$ is the reverse training duration. Using the
corresponding received signal, $y_{A,\tau} = \mathbf{h}^H \mathbf{x}_B + w_{A,\tau}$, where the
AWGN $w_{A,\tau} \in \mathbb{C}$ is distributed as $\mathcal{CN}(0, 1)$, node A computes
an estimate of the singular value as follows:

$$\hat{\sigma} = \frac{\Re\{y_{A,\tau}\}}{\sqrt{P L_{B,\tau}}} = \sigma\, \Re\{\mathbf{v}^H \hat{\mathbf{v}}\} + \tilde{w}_A, \tag{3}$$

where $\tilde{w}_A \triangleq \Re\{w_{A,\tau}\}/\sqrt{P L_{B,\tau}}$. Note that the estimate $\hat{\sigma}$ could be
negative; this is taken care of by the power control proposed
in Sec. III, which uses $\hat{\sigma}$ only when it is greater than a
positive threshold. Since a low or negative $\hat{\sigma}$ is likely to
be inaccurate, the thresholding technique helps to avoid the
poor DMT performance due to such estimates. The RCT
scheme employed above is different from existing channel
agnostic methods in that the minimum training length in the
proposed scheme is only 1 symbol. This represents a factor of
$r$ reduction compared to orthogonal RCT schemes, where the
minimum training length increases linearly with $r$, and this
difference in overhead could be significant when $L_c$ is small.
Also, if $\hat{\mathbf{v}}$ is error-free, it is the optimal beamforming vector
for estimating $\sigma$ at node A.
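The reverse-link training step can be illustrated with a short simulation. This is a minimal Python sketch under stated assumptions, not the authors' implementation: it assumes perfect CSIR (node B beamforms along the true channel direction), a training vector of the form $\sqrt{P L_{B,\tau}}\,\mathbf{v}$, and an estimator that normalizes the real part of the single received sample. The helper names (`cscg`, `reverse_training_estimate`, `estimation_noise_stats`) are hypothetical.

```python
import math
import random

def cscg(rng):
    """Draw one CN(0, 1) sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def reverse_training_estimate(h, v_hat, P, L_rev, rng):
    """Return node A's estimate of sigma from one reverse training block.

    Assumed model: node B sends x_B = sqrt(P*L_rev) * v_hat and node A
    observes y = h^H x_B + w with w ~ CN(0, 1).
    """
    gain = math.sqrt(P * L_rev)
    y = sum(hi.conjugate() * gain * vi for hi, vi in zip(h, v_hat)) + cscg(rng)
    # Only the real part carries information about sigma = ||h||.
    return y.real / gain

def estimation_noise_stats(r=3, P=10.0, L_rev=1, trials=20000, seed=0):
    """Empirical mean and variance of (sigma_hat - sigma) with perfect CSIR."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        h = [cscg(rng) for _ in range(r)]
        sigma = math.sqrt(sum(abs(hi) ** 2 for hi in h))
        v = [hi / sigma for hi in h]  # perfect CSIR: v_hat equals v
        errors.append(reverse_training_estimate(h, v, P, L_rev, rng) - sigma)
    mean = sum(errors) / trials
    var = sum((e - mean) ** 2 for e in errors) / trials
    return mean, var
```

Under these assumptions the estimation error is just the scaled real part of the receiver noise, so its variance should be close to $1/(2 P L_{B,\tau})$ for any number of antennas $r$, which is why a single training symbol suffices in this scheme.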

3) Multiplexing Gain and Diversity Order: We recall the
definitions of the multiplexing gain, $g_m$, and the diversity order
$d$ from [2]:

$$g_m \triangleq \lim_{P \to \infty} \frac{R_P}{\log P}, \qquad d \triangleq -\lim_{P \to \infty} \frac{\log P_{\text{out}}}{\log P}, \tag{4}$$

where $R_P$ is the target data rate when the average data
power constraint is $P$, and $P_{\text{out}}$ is the corresponding outage
probability, i.e., the probability that $R_P$ exceeds the channel
capacity. In this work, the target data rate $R_P = g_m \log P$
is fixed and is independent of the CSIT; the extension of
our proposed methods to joint rate and power adaptation is
relegated to future work. The rate of data transmission $R_P$ is
increased with $P$ by increasing the cardinality of the signal
set, keeping the symbol duration fixed. We ignore the effect of
spectral leakage, and assume that the signal bandwidth remains
fixed as $P$ goes to infinity. Also, we use outage probability as
a proxy for the probability of error at high SNR with
finite-length codes; this is because the probability of error can be
made to decrease as fast as the outage probability using
finite-length approximately universal codes [23], [24].

In the next section, we assume perfect CSI at node B 
and derive the achievable DMT performance of our proposed 
training and data transmission schemes. 

III. DMT Analysis with Perfect CSIR 

When the CSIR is perfect, we have $\hat{\mathbf{v}} = \mathbf{v}$, and in this case,
it is easy to see that (2) is optimal for estimating $\sigma$ given a
power constraint $P$ on the training signal. This is because,
in general, the training signal can be expressed as the linear
combination $\mathbf{x}_B = \delta \mathbf{v} + \beta \mathbf{v}_\perp$, where $\mathbf{v}_\perp$ is orthogonal to $\mathbf{v}$
and $\delta$ and $\beta$ are some constants. Then, the received training
signal at node A is $y_{A,\tau} = \delta \sigma + w_{A,\tau}$, i.e., the power in $\mathbf{v}_\perp$
does not help in estimating $\sigma$. From (3), an unbiased estimator
of the singular value at node A is given by

$$\hat{\sigma} = \sigma + \tilde{w}_A, \tag{5}$$

where $\tilde{w}_A$ is the noise term in (3).
Note that since the channel is assumed to be Rayleigh fading,
$\sigma^2$ is chi-square distributed with $2r$ degrees of freedom. Also,
we employ this estimator primarily because we are interested
in deriving the achievable DMT performance, and for this
purpose, this simple unbiased estimator is sufficient.

A. Power-Controlled Data Transmission from Node A to 
Node B 

Given the CSIT $\hat{\sigma}$ in (5), node A uses a power $\mathcal{P}(\hat{\sigma})$ in
the forward link data transmission phase, to avoid outages
while satisfying the average data power constraint $P$. The
corresponding data signal received at node B is given by

$$\mathbf{y}_{B,d} = \sqrt{\mathcal{P}(\hat{\sigma})}\, \mathbf{h}\, x_{A,d} + \mathbf{w}_{B,d}, \tag{6}$$

where $x_{A,d} \sim \mathcal{CN}(0, 1)$, and with appropriate power
normalization, the entries of the AWGN $\mathbf{w}_{B,d} \in \mathbb{C}^{r \times 1}$ are assumed
to be i.i.d. $\mathcal{CN}(0, 1)$. Also, $\mathcal{P}(\hat{\sigma})$ is chosen independent of
$x_{A,d}$ such that $\mathbb{E}\{\mathcal{P}(\hat{\sigma})\} = P$, where the expectation is with
respect to $\hat{\sigma}$ given in (5), taken across all coherence blocks.
Since $\mathbb{E}\{|x_{A,d}|^2\} = 1$ within a block, this ensures that the
average data power constraint at node A is satisfied.

We now present the data power control function $\mathcal{P}(\hat{\sigma})$
considered in this paper. Our proposed power control function
is motivated as follows. The capacity of a fading channel with
mismatched CSIT and CSIR is not known in closed form [25].
Since the outage probability computation requires a closed
form expression for the capacity, we consider a genie-aided
receiver as in [26], where node B is assumed to know $\mathcal{P}(\hat{\sigma})$.
This is schematically illustrated in Fig. 1. Then, the achievable
data rate conditioned on the knowledge of $\sqrt{\mathcal{P}(\hat{\sigma})}\, \mathbf{h}$ is given
by

$$C = \frac{L_c - L_{B,\tau}}{L_c} \log\left(1 + \sigma^2 \mathcal{P}(\hat{\sigma})\right). \tag{7}$$

An outage occurs when $R_P$, the target data rate, exceeds $C$.
Its probability is upper bounded by

$$P_{\text{out}} \le \Pr\left\{ \frac{L_c - L_{B,\tau}}{L_c} \log\left(1 + \sigma^2 \mathcal{P}(\hat{\sigma})\right) < R_P \right\}. \tag{8}$$
Fig. 1. System model for reverse channel training with perfect CSIR used
in Section III.



Power constraint: The description of the power control
would be complete if the parameters $n$, $\kappa_P$ and $l$ can be chosen
such that $\mathbb{E}[\mathcal{P}(\hat{\sigma})] = P$, which is the essence of the following
Lemma.

Lemma 1: Let Op ^ For 1 < s < r, there exists a 



if < / < r 



where a 
1. 



such that E[7'(o-)] = P, 



Proof: See Appendix B. ■

Due to Lemma [T] in the rest of this paper, we consider 
Op = Also, in Sec. |iy] we show that a minor 

modification of the above data power control scheme can 
be employed even with imperfect CSIR. The next subsection 
presents the achievable DMT of the proposed training and 
power control schemes. 



Note that the exact outage probability is obtained by
minimizing the right hand side of (8) over all $\mathcal{P}(\hat{\sigma})$ satisfying
$\mathbb{E}\{\mathcal{P}(\hat{\sigma})\} = P$. Hence, using our proposed data power control
scheme leads to an upper bound on the outage probability,
which is sufficient for obtaining the achievable DMT
performance. If the CSIT is perfect (i.e., $\hat{\sigma}^2 = \sigma^2$), it is shown in [3]
that the power control that minimizes the outage probability
is given by

$$\Phi(\sigma^2) \triangleq \frac{\exp\left(\frac{R_P L_c}{L_c - L_{B,\tau}}\right) - 1}{\sigma^2}. \tag{9}$$

Note that since $R_P = g_m \log P$ and $\mathbb{E}\{1/\sigma^2\} = 1/(r-1)$, $\Phi(\sigma^2)$
satisfies $\mathbb{E}\{\Phi(\sigma^2)\} \le P$ for large enough $P$, provided $g_m <
(L_c - L_{B,\tau})/L_c$. With inaccurate CSIT, due to the estimation
error in $\hat{\sigma}$, the natural extension of using a transmission
power of $\Phi(\hat{\sigma}^2)$ could result in allocating insufficient power
or more power than required, which could lead to suboptimal
performance. Also, inverting the channel for all values of $\hat{\sigma}$
results in an infinite average power since the Gaussian noise
can make the estimate $\hat{\sigma}$ arbitrarily small with a non-zero
probability. One solution is to use a transmit power of $\Phi(\hat{\sigma}^2)$
when $\hat{\sigma} > \theta_0$ and a zero power otherwise, where $\theta_0$ is chosen
such that $\mathbb{E}[\Phi(\hat{\sigma}^2) \mathbb{1}_{\hat{\sigma} > \theta_0}] = P$. The drawback of this method
is that it results in an outage probability of 1 when $\hat{\sigma} < \theta_0$,
leading to a zero diversity order. To overcome this problem,
we choose the threshold $\theta_0$ such that $\theta_0 \to 0$ as $P \to \infty$.
Moreover, when $\hat{\sigma} < \theta_0$, we do not necessarily want to use
zero power, since the small value of $\hat{\sigma}$ could be due to the
estimation error. This motivates the following modified power
control:



' ^ ' 1 Kp X a > Op, 



(10) 



where s > 1 is a parameter, and we use Op = n > 0, 
for mathematical tractability. The parameters n, Kp and I > 
are chosen such that E[7'((j)] = P. Although similar power 
control schemes have been employed in the literature with 
perfect CSIT E) or orthogonal RCT d], |[T3], JTH, the form 
in (10) is new. Specifically, the power control scheme in [3],
[11], [13] can be obtained from (10) by setting $s = 1$, $\theta_P = 0$
and $l = -\infty$; while that in [19] can be obtained by setting
$s = r$, $\theta_P = 0$ and $l = -\infty$.



B. Achievable DMT Analysis 

Theorem 1: Given $r$ receive antennas and $L_{B,\tau}$ training
symbols being used per coherence interval $L_c$ to estimate
the CSIT in a SIMO system with perfect CSIR and a
genie-aided receiver, an achievable diversity order as a function of
multiplexing gain is given by

$$d(g_m) = r\left(\min\{l, s + 1\} - \frac{g_m}{\alpha}\right), \tag{11}$$

where $0 < l \le r + 1$, $1 \le s \le r$, $0 < g_m < \alpha$, and $\alpha \triangleq
\frac{L_c - L_{B,\tau}}{L_c}$ represents the fractional data transmit duration.
Proof: See Appendix C. ■

Remark: From a DMT perspective, it is clear from Theorem
1 that $s = r$, $l = r + 1$ is superior to $s = 1$, $l = 2$. On the other
hand, when $\hat{\sigma} < 1$, $\Phi(\hat{\sigma}^{2r})$ could be much greater than $\Phi(\hat{\sigma}^2)$.
Thus, in practical systems with a peak power constraint per
transmitted codeword, $s = 1$, $l = 2$ could be preferable over
$s = r$, $l = r + 1$. In the sequel, for convenience, we associate
$l = 2$ with $s = 1$ and $l = r + 1$ with $s = r$, and drop the
explicit dependence of the diversity order on $l$. Further remarks
and discussions on the result obtained here are deferred to
Sec. V.
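The diversity order in Theorem 1 is easy to tabulate. The following is a small Python sketch of the expression in (11), assuming $\alpha = (L_c - L_{B,\tau})/L_c$ as stated in the theorem; the function name `dmt_perfect_csir` and the choice to admit the endpoint $g_m = 0$ are illustrative assumptions rather than part of the paper.

```python
def dmt_perfect_csir(g_m, r, s, l, L_c, L_rev):
    """Achievable diversity order r * (min(l, s + 1) - g_m / alpha).

    alpha = (L_c - L_rev) / L_c is the fractional data transmit duration.
    The endpoint g_m = 0 is accepted here for convenience (an assumption).
    """
    alpha = (L_c - L_rev) / L_c
    if not 1 <= s <= r:
        raise ValueError("s must satisfy 1 <= s <= r")
    if not 0 < l <= r + 1:
        raise ValueError("l must satisfy 0 < l <= r + 1")
    if not 0 <= g_m < alpha:
        raise ValueError("g_m must satisfy 0 <= g_m < alpha")
    return r * (min(l, s + 1) - g_m / alpha)

if __name__ == "__main__":
    # Parameters chosen to mirror r = 5, L_c = 20, one reverse training symbol.
    for g in (0.0, 0.25, 0.5, 0.75, 0.9):
        aggressive = dmt_perfect_csir(g, r=5, s=5, l=6, L_c=20, L_rev=1)
        conservative = dmt_perfect_csir(g, r=5, s=1, l=2, L_c=20, L_rev=1)
        print(g, round(aggressive, 3), round(conservative, 3))
```

At $g_m = 0$ the choice $s = r$, $l = r + 1$ gives $r(r + 1)$, which is the quadratic growth in $r$ mentioned in the abstract, while $s = 1$, $l = 2$ gives $2r$.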

IV. Three Way Training 

In this section, we consider the more practical scenario
where training is performed in both directions. We show that
with fixed power training, one can achieve nearly the same
DMT as derived in Sec. III for the perfect CSIR case. Unlike
in the previous section, the analysis presented here is exact,
in the sense that it does not require the assumption of a
genie-aided receiver, and hence, the DMT derived here is indeed
achievable in practice. The transmission protocol now consists
of four phases, as shown in Table I. The CSIR and CSIT
are obtained by transmitting a fixed power training sequence
in both directions, as explained in Sec. II. However, even a
small mismatch in the CSI knowledge at node A and node
B can potentially lead to a large mismatch in their estimate
of the data transmit power [13]. Thus, it is essential to train
node B about node A's knowledge of $\mathcal{P}(\hat{\sigma})$. This leads to a
third phase of training, which is an additional power-controlled
forward-link training phase. First, in the following subsection,
we explain the power control scheme that is employed here.
A. Power Control Scheme

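The power control scheme we propose to employ in this
section is as given by (10), for the following reason. Let $\hat{\mathbf{h}}$ denote
the MMSE estimate of the channel at node B, and consider $\hat{\sigma}$
in (3). We have

$$\hat{\sigma} = \frac{\Re\{y_{A,\tau}\}}{\sqrt{P L_{B,\tau}}} = \|\hat{\mathbf{h}}\|_2 + w_{\text{eff}}, \quad \text{where } w_{\text{eff}} \triangleq \Re\{\tilde{\mathbf{h}}^H \hat{\mathbf{v}}\} + \frac{\Re\{w_{A,\tau}\}}{\sqrt{P L_{B,\tau}}}. \tag{12}$$

Note that $\hat{\mathbf{h}}$ and $\tilde{\mathbf{h}}$ are
independent Gaussian random variables.^2 Since $\hat{\mathbf{v}}$ is uniformly
distributed on the unit sphere and is independent of $\tilde{\mathbf{h}}$,
$\Re\{\tilde{\mathbf{h}}^H \hat{\mathbf{v}}\}$ is Gaussian distributed. This implies that the
effective noise, $w_{\text{eff}}$, is Gaussian distributed with $\mathbb{E}|w_{\text{eff}}|^2 \doteq P^{-1}$
and independent of $\hat{\mathbf{h}}$. Therefore, the estimate of the singular
value at node A is statistically similar to the estimate given by
(5) in the perfect CSIR case. Thus, we use a similar power
control, $\mathcal{P}(\hat{\sigma})$ in (10), where $\hat{\sigma}$ is given by (12). Also, with
a slight abuse of notation, $\alpha$ now denotes the fractional data
transmit duration after accounting for all three training phases, and
$L_{A,\tau_2}$ is the training duration in the third phase of training
(phase III), which is in the forward-link direction.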

In this section, without loss of generality, we move the
power scaling $\sqrt{P}$ into the data symbol transmitted by node A,
so that $\mathbb{E}\{\mathcal{P}(\hat{\sigma})\} = 1$ (see (15) below), where the expectation
is taken with respect to the distribution of $\hat{\sigma}$ in (12). Now, in
the proof of Lemma 1, using the probability density function
(pdf) of $\|\hat{\mathbf{h}}\|_2$ in place of the pdf of $\sigma$, and noting that the
effective noise variance = l/P, we get Kp = pcjl^/c ™d the 
constraint < / < r to satisfy = 1 at high SNR. In 

the next subsection, we explain the third round of training that 
alleviates the mismatch in the knowledge of the data transmit 
power 



From (14), node B computes an MMSE estimate of $\mathbf{p}_c$,
denoted by $\hat{\mathbf{p}}_c$. Let $\tilde{\mathbf{p}}_c = \mathbf{p}_c - \hat{\mathbf{p}}_c$. Although a closed
form expression for $\hat{\mathbf{p}}_c$ is hard to find, the error $\tilde{\mathbf{p}}_c$ in the
MMSE estimate has the following interesting property, which
facilitates the calculation of the outage probability in Sec.
IV-D. An analogous result has been shown in [27] for the
scalar case.

Lemma 2: E||pc||2^ ^ for every z > 0. 
Proof: See Appendix D. ■



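C. Phase IV (Data Transmission)

Using $\mathcal{P}(\hat{\sigma})$, node A sends the data signal $x =
\sqrt{P \mathcal{P}(\hat{\sigma})}\, x_{A,d}$, where $x_{A,d}$ is distributed as $\mathcal{CN}(0, 1)$ and is
independent of $\mathcal{P}(\hat{\sigma})$. Note that $\mathbb{E}|x|^2 = P$ by construction,
where the expectation is taken with respect to both $\hat{\sigma}$ and $x_{A,d}$.
The corresponding signal received at node B is

$$\mathbf{y}_{B,d} = \sqrt{P \mathcal{P}(\hat{\sigma})}\, \mathbf{h}\, x_{A,d} + \mathbf{w}_{B,d} \tag{15}$$

$$\phantom{\mathbf{y}_{B,d}} = \sqrt{P}\, \hat{\mathbf{p}}_c\, x_{A,d} + \sqrt{P}\, \tilde{\mathbf{p}}_c\, x_{A,d} + \mathbf{w}_{B,d}, \tag{16}$$

where $\hat{\mathbf{p}}_c$ is the MMSE estimate of the composite channel
$\mathbf{p}_c = \sqrt{\mathcal{P}(\hat{\sigma})}\, \mathbf{h}$ and $\tilde{\mathbf{p}}_c = \mathbf{p}_c - \hat{\mathbf{p}}_c$ is the corresponding error.
Since $\hat{\mathbf{p}}_c$ is an MMSE estimate, using the worst case noise
theorem, we have the following lower bound on the mutual
information, $I(x_{A,d}; \mathbf{y}_{B,d} \mid \hat{\mathbf{p}}_c) \ge C_{AB}$, where

$$C_{AB} \triangleq \alpha \log\left(1 + \frac{P \|\hat{\mathbf{p}}_c\|_2^2}{P\, \mathbb{E}\left[\|\tilde{\mathbf{p}}_c\|_2^2 \mid \bar{\mathbf{y}}_{B,\tau_2}\right] + 1}\right), \tag{17}$$

$\bar{\mathbf{y}}_{B,\tau_2}$ is the normalized phase III observation in (14), and
$\alpha \triangleq \frac{L_c - L_{A,\tau_1} - L_{B,\tau} - L_{A,\tau_2}}{L_c}$ is the fractional data transmit
duration after accounting for the time overheads in all three
training phases.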



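B. Phase III (Power-Controlled Forward Link Training)

In this phase, node A transmits the training sequence
$x_{A,\tau_2} = \sqrt{P L_{A,\tau_2}} \sqrt{\mathcal{P}(\hat{\sigma})}$, where $L_{A,\tau_2}$ is the training
duration. The corresponding received training signal at node B
is given by

$$\mathbf{y}_{B,\tau_2} = \sqrt{P L_{A,\tau_2}} \sqrt{\mathcal{P}(\hat{\sigma})}\, \mathbf{h} + \mathbf{w}_{B,\tau_2}, \tag{13}$$

where $\mathbf{w}_{B,\tau_2} \in \mathbb{C}^{r \times 1}$ is the AWGN with $\mathcal{CN}(0, 1)$ entries.
The goal at node B is to estimate the composite channel $\mathbf{p}_c \triangleq
\sqrt{\mathcal{P}(\hat{\sigma})}\, \mathbf{h}$. Dividing (13) by $\sqrt{P L_{A,\tau_2}}$, we get

$$\bar{\mathbf{y}}_{B,\tau_2} \triangleq \frac{\mathbf{y}_{B,\tau_2}}{\sqrt{P L_{A,\tau_2}}} = \mathbf{p}_c + \frac{\mathbf{w}_{B,\tau_2}}{\sqrt{P L_{A,\tau_2}}}. \tag{14}$$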
— > h as P — oo. Moreover, ||h||2 is a chi distributed random variable. 



D. DMT Analysis With Three-Way Training 

Theorem 2: For a SIMO system with $r$ receive antennas
and three phases of training and the data transmission phase
as described in Table I, an achievable DMT is given by

$$d(g_m) = r\left(\min\{l, s\} + 1 - \frac{g_m}{\alpha}\right), \tag{18}$$

where $0 < l \le r$, $1 \le s \le r$, $0 < g_m < \alpha$, and $\alpha$ is the
fractional data transmit duration defined in (17).

Proof: See Appendix E. ■

Remark: The above three way training scheme can be
generalized to $k$ training rounds to improve the diversity order,
as in [11], [19]. However, this is mathematically cumbersome
and out of the scope of our work.
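To see how the extra training phases affect the result, the expression in Theorem 2 can be evaluated directly. This is a minimal Python sketch that assumes the fractional data duration is $\alpha = (L_c - \beta)/L_c$, with $\beta$ the sum of the three training durations, in line with the description of $\alpha$ as accounting for all three training phases; the function and argument names are hypothetical.

```python
def dmt_three_way(g_m, r, s, l, L_c, L_fwd1, L_rev, L_fwd2):
    """Achievable diversity order r * (min(l, s) + 1 - g_m / alpha).

    Assumption: alpha = (L_c - beta) / L_c, where beta is the total number of
    training symbols spent in phases I, II and III.
    """
    beta = L_fwd1 + L_rev + L_fwd2
    if beta >= L_c:
        raise ValueError("training must leave room for data in the block")
    alpha = (L_c - beta) / L_c
    if not 1 <= s <= r:
        raise ValueError("s must satisfy 1 <= s <= r")
    if not 0 < l <= r:
        raise ValueError("l must satisfy 0 < l <= r")
    if not 0 <= g_m < alpha:
        raise ValueError("g_m must satisfy 0 <= g_m < alpha")
    return r * (min(l, s) + 1 - g_m / alpha)

if __name__ == "__main__":
    # One symbol per training phase, so beta = 3 and alpha = 17/20.
    print(dmt_three_way(0.5, r=5, s=5, l=5, L_c=20, L_fwd1=1, L_rev=1, L_fwd2=1))
```

With one symbol per phase and $L_c = 20$, the value at $g_m = 0.5$ is only slightly below the perfect-CSIR figure for the same $r$ and $s$, which is consistent with the statement that nearly the same DMT is achievable.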



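TABLE I
Three Way Training in a TDD-SIMO System

Phase | Description | Input-Output Equation
I | Fixed power training (Node A → Node B) | $\mathbf{y}_{B,\tau} = \mathbf{h}\, x_{A,\tau} + \mathbf{w}_{B,\tau}$
II | Fixed power training (Node B → Node A) | $y_{A,\tau} = \mathbf{h}^H \mathbf{x}_{B,\tau} + w_{A,\tau}$
III | Power controlled training (Node A → Node B) | $\mathbf{y}_{B,\tau_2} = \sqrt{P L_{A,\tau_2} \mathcal{P}(\hat{\sigma})}\, \mathbf{h} + \mathbf{w}_{B,\tau_2}$
IV | Power controlled data (Node A → Node B) | $\mathbf{y}_{B,d} = \sqrt{P \mathcal{P}(\hat{\sigma})}\, \mathbf{h}\, x_{A,d} + \mathbf{w}_{B,d}$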
Fig. 2. The achievable DMT with the training and power control scheme
proposed in Sec. III compared with the performance with orthogonal RCT and
the data power control proposed in [13], [19] (and appropriately accounting
for the training duration overhead and switching off antennas to achieve higher
values of $g_m$). The plot corresponds to a SIMO system with $r = 5$ antennas,
with coherence time $L_c = 20$ symbols, and reverse training duration of
$L_{B,\tau} = 1$ symbol.



Fig. 3. Outage probability versus the average data power $P$ for the
fixed-power training scheme proposed in Sec. III with the data power control
scheme given by (10) with $s = 1$. Here, $r = 3$, $L_c = 40$ and $L_{B,\tau} = 1$.
With $g_m = 0.8$, the target data rate was set as $R_P = 4 + g_m \log P$ to
facilitate the comparison of the curves.



V. Discussion 

Recall that with perfect CSIR and imperfect CSIT, with
$l \ge s + 1$, and for a genie aided channel, it was shown in
Theorem 1 that the following DMT is achievable:

$$d(g_m) = r\left(s + 1 - \frac{g_m L_c}{L_c - L_{B,\tau}}\right), \tag{19}$$

where $1 \le s \le r$, $0 < g_m < \frac{L_c - L_{B,\tau}}{L_c}$. In contrast, for the
same genie aided channel, it was shown in [26] that a diversity
order of

$$d_s(g_m) = r\left(2 - \frac{g_m L_c}{L_c - r L_{B,\tau}}\right), \quad 0 < g_m < \frac{L_c - r L_{B,\tau}}{L_c}, \tag{20}$$

is achievable using orthogonal reverse channel training. Note
that $d_s(g_m)$ saturates as $r$ gets large, as opposed to (19),
which is monotonically increasing in $r$. In order to achieve
a $g_m \ge \frac{L_c - r L_{B,\tau}}{L_c}$, the authors suggest turning
off one receive antenna at a time to reduce the training
burden until $r = 2$. For example, turning off one antenna,
$g_m \in \left[\frac{L_c - r L_{B,\tau}}{L_c}, \frac{L_c - (r-1) L_{B,\tau}}{L_c}\right)$ is achievable at a reduced
diversity order of $d_s(g_m) = (r - 1)\left(2 - \frac{g_m L_c}{L_c - (r-1) L_{B,\tau}}\right)$.
This is in contrast to our result, which can accommodate a
larger multiplexing gain, $g_m < \frac{L_c - L_{B,\tau}}{L_c}$, irrespective of $r$,
while simultaneously achieving a higher diversity order at each
$g_m$. We note that for a SIMO channel, a diversity order of
$r(r + 1 - g_m)$ for $0 < g_m < 1$ was obtained in [11], [19],
using channel-independent training, and without accounting
for the training duration overhead. This corresponds to taking
$L_c \to \infty$ in (19). The performance of the proposed scheme
is schematically contrasted with orthogonal RCT in Fig. 2,
for a SIMO system with $r = 5$, $L_c = 20$, and $L_{B,\tau} = 1$
symbol. The advantage of the proposed scheme at higher
values of the multiplexing gain is clear from the plot. The
proposed training scheme thus results in a factor $r$ reduction
in the training duration, which, along with the proposed data
power control scheme, translates to an increase in the range of
achievable multiplexing gains, while simultaneously offering
a better diversity order compared to orthogonal RCT schemes.

Comparing Theorems 1 and 2, we see that the DMT
performance of a genie aided receiver with perfect CSIR is an
upper bound on the performance of the system with imperfect
CSIR and CSIT, as expected. Also, the performance of the
two systems is similar, except that in the latter case, the factor
$\alpha$ captures the loss in data transmission time due to all three
training phases. Similar observations as the above regarding
the improvement in DMT can be made for the three way
training scheme compared to orthogonal RCT schemes.
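The difference in the range of supported multiplexing gains can be made concrete with a few lines of arithmetic. The following Python sketch assumes, as stated in Sec. II, that orthogonal RCT needs a training length that grows linearly with the number of active receive antennas, while the proposed scheme needs $L_{B,\tau}$ symbols regardless of $r$; the function names are illustrative only.

```python
def max_multiplexing_gain(L_c, training_symbols):
    """Fraction of the coherence block left for data after training."""
    if not 0 < training_symbols < L_c:
        raise ValueError("training must occupy part, but not all, of the block")
    return (L_c - training_symbols) / L_c

def compare_gain_ranges(r, L_c, L_rev):
    """Upper limits on g_m for the proposed scheme and for orthogonal RCT.

    For orthogonal RCT the limit is listed for each number of active
    antennas from r down to 2 (antennas are switched off one at a time).
    """
    proposed = max_multiplexing_gain(L_c, L_rev)
    orthogonal = {
        active: max_multiplexing_gain(L_c, active * L_rev)
        for active in range(r, 1, -1)
    }
    return proposed, orthogonal

if __name__ == "__main__":
    proposed, orthogonal = compare_gain_ranges(r=5, L_c=20, L_rev=1)
    print("proposed:", proposed)
    for active, limit in orthogonal.items():
        print("orthogonal,", active, "antennas:", limit)
```

For $r = 5$, $L_c = 20$ and $L_{B,\tau} = 1$, the proposed scheme supports multiplexing gains up to 0.95 with all antennas active, whereas orthogonal training reaches 0.75 with five antennas and only 0.9 after switching off all but two.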

VI. Simulation Results 

We now briefly present Monte-Carlo simulation results to 
illustrate the outage probability performance of our proposed 
RCT and forward-link data power control schemes. We con- 
sider a Rayleigh fading channel with three receive antennas. 
We calculate the outage probability by averaging over 10^ i.i.d. 
channel and training noise instantiations. We set the channel 
coherence time and reverse training duration as Lc = 40 and 
$L_{B,\tau} = 1$, respectively. Figures 3 and 4 show the outage
probability of the proposed fixed-power training scheme and
the data power control scheme in (10) with $s = 1$ and
$s = r = 3$, respectively, as a function of $P$, with $g_m = 0$ and
$R_P = 4$ bits/channel use (1 and 1.5 bits/channel use in the case of
Fig. 4), and with $g_m = 0.8$. Although the slopes of the curves
do not match the theoretical diversity order because the
latter requires infinite SNR, the improved performance of the
proposed schemes is clear from the graphs. Also, in Fig.
3, since the proposed scheme uses only $L_{B,\tau} = 1$ training
symbol while the orthogonal RCT scheme uses $r L_{B,\tau} = 3$
training symbols, the former shows a higher outage than the
latter at lower SNRs. Note that we have not plotted the outage
Proof: The result follows from



Fig. 4. Outage probability versus the average data power $P$ (in dB) for the
fixed-power training scheme proposed in Sec. III with the data power control
scheme given by (10) with $s = r$. Here, $r = 3$, $L_c = 40$ and $L_{B,\tau} = 1$.



performance of the three-way training scheme in Sec. |IV] This 
is because the outage probability is hard to compute, since a 
closed-form expression for Pc is not available. 
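
As a rough illustration of how outage curves of this kind can be estimated, the following sketch simulates a simplified baseline, not the proposed scheme: constant transmit power $P$, perfect CSIR, an i.i.d. Rayleigh SIMO channel with $r = 3$, and the outage event $\alpha \log_2(1 + P\|h\|^2) < R_P$ with $\alpha = (L_c - L_{B,\tau})/L_c$. The function names, the number of trials, the unit noise variance and the base-2 logarithm are assumptions made for this example. Since $\|h\|^2$ then follows a Gamma distribution with shape $r$, the estimate can be checked against a closed form.

```python
import math
import numpy as np

def outage_mc(p_db, r=3, rate=4.0, l_c=40, l_b=1, trials=200_000, seed=0):
    """Monte-Carlo outage estimate for a constant-power SIMO baseline (illustrative)."""
    rng = np.random.default_rng(seed)
    alpha = (l_c - l_b) / l_c              # fraction of the block left for data
    p = 10.0 ** (p_db / 10.0)              # power in linear scale
    # h ~ CN(0, I_r): real and imaginary parts each have variance 1/2.
    real = rng.standard_normal((trials, r))
    imag = rng.standard_normal((trials, r))
    gain = np.sum(real ** 2 + imag ** 2, axis=1) / 2.0   # sigma^2 = ||h||^2
    return float(np.mean(alpha * np.log2(1.0 + p * gain) < rate))

def outage_exact(p_db, r=3, rate=4.0, l_c=40, l_b=1):
    """Closed form Pr{sigma^2 < z} = 1 - exp(-z) * sum_{k<r} z^k / k!."""
    alpha = (l_c - l_b) / l_c
    z = (2.0 ** (rate / alpha) - 1.0) / 10.0 ** (p_db / 10.0)
    return 1.0 - math.exp(-z) * sum(z ** k / math.factorial(k) for k in range(r))

if __name__ == "__main__":
    for p_db in (10, 14, 18):
        print(p_db, outage_mc(p_db), outage_exact(p_db))
```

Replacing the constant power by a power control rule driven by a noisy singular-value estimate would move this baseline toward the setting studied here, at the cost of also simulating the training noise.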

VII. Conclusions 

This paper proposed reverse training and data power control 
schemes for a TDD-SIMO system with perfect/imperfect CSIR 
and investigated their DMT performance. It was shown that 
a diversity order of $d(g_m) = r\left(s + 1 - \frac{g_m}{\alpha}\right)$ is achievable 
for $l \ge s + 1$, $1 \le s \le r$ and $0 \le g_m < \alpha$, where $\alpha$ 
represents the fractional data transmit duration. In contrast to 
channel-agnostic orthogonal training schemes, the diversity 
order was shown to increase monotonically with $r$ at nonzero 
multiplexing gain, which is a significant improvement. The 
DMT analysis was extended to a more practical situation 
where the training is done in both directions. In this case 
also, it was shown that the DMT performance can improve 
quadratically with the number of receive antennas, and nearly 
the same DMT can be achieved as that with perfect CSIR 
and a genie-aided receiver. In terms of system design for 
reciprocal SIMO systems, the key messages from this work 
are that it is important to (a) exploit the CSI at the receiver in 
designing the RCT and (b) use a modified channel-inversion 
type power control scheme that transmits data at some nonzero 
power even when the estimated singular value at the 
transmitter is poor. For fast varying channels, these ingredients 
can lead to a significant advantage in DMT performance, 
which, at finite SNR, can translate to a large improvement 
in outage probability performance compared to orthogonal 
training schemes. Future work could extend the DMT analysis 
to a time-selective block fading reciprocal channel, where the 
channel is correlated within a block. 
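
To make the scaling with the number of receive antennas concrete, the sketch below evaluates a diversity order of the form $d(g_m) = r(\min\{l, s+1\} - g_m/\alpha)$, which is the form the perfect-CSIR result is taken to have in this example. The function name, the argument checks and the parameter values are hypothetical and only serve the illustration.

```python
def diversity_order(r, s, l, g_m, alpha):
    """Evaluate d(g_m) = r * (min(l, s + 1) - g_m / alpha) (assumed form)."""
    if not (1 <= s <= r and 0 <= g_m < alpha):
        raise ValueError("expected 1 <= s <= r and 0 <= g_m < alpha")
    return r * (min(l, s + 1) - g_m / alpha)

if __name__ == "__main__":
    alpha = (40 - 1) / 40  # L_c = 40 with one reverse training symbol
    for r in (1, 2, 3, 4):
        # With s = r and l = r + 1, the value at zero multiplexing gain is r * (r + 1).
        print(r, diversity_order(r, s=r, l=r + 1, g_m=0.0, alpha=alpha))
```

With $s = r$ and $l = r + 1$, the value at $g_m = 0$ grows as $r(r+1)$, which matches the quadratic growth described above.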

Appendix 

A. Useful Lemmas 

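Lemma 3: If the random variable $\sigma^2$ is chi-square distributed 
with $2r$ degrees of freedom, then $\Pr\{\sigma^2 < z\} \le \frac{z^r}{r!}$, 
$z > 0$. 

Proof: The result follows from 

$$\Pr\{\sigma^2 < z\} = \frac{1}{(r-1)!} \int_0^z x^{r-1} e^{-x}\, dx \tag{21}$$ 
$$\le \frac{1}{(r-1)!} \int_0^z x^{r-1}\, dx = \frac{z^r}{r!}. \tag{22}$$ 
■ 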
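
A bound of this type can be sanity-checked numerically. The sketch below assumes the normalization in which $\sigma^2 = \|h\|^2$ with unit-variance complex Gaussian entries, so that $\Pr\{\sigma^2 < z\} = 1 - e^{-z}\sum_{k<r} z^k/k!$, and compares it with $z^r/r!$; the normalization, the constant $1/r!$ and the function names are assumptions of this example.

```python
import math

def chi2_cdf(z, r):
    """Pr{sigma^2 < z} when sigma^2 has pdf x^(r-1) exp(-x) / (r-1)!."""
    return 1.0 - math.exp(-z) * sum(z ** k / math.factorial(k) for k in range(r))

def power_bound(z, r):
    """Upper bound z^r / r!, obtained by dropping exp(-x) from the integrand."""
    return z ** r / math.factorial(r)

if __name__ == "__main__":
    for r in (1, 2, 3):
        for z in (0.01, 0.1, 1.0, 5.0):
            print(r, z, chi2_cdf(z, r), power_bound(z, r))
```

The bound is tight as $z \to 0$, which is the regime that matters for the high-SNR exponents.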



Lemma 4: For the system in Q, \a\ < au, where afj = 

Proof: We upper bound the absolute value of y) as follows: 

\a\ < a|3?{v«v}| + \^^\ < a + |^^,|, (23) 

V p^Lb,t 

where (a) follows from the triangle inequality and (b) follows 
since |5R{v^v}| < 1. ■ 

B. Proof of Lemma 1 

Consider the following constraint on the data power 



Via)faia;P)d&^P, 



(24) 



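where $f_{\hat\sigma}(\hat\sigma; P)$ is the pdf of $\hat\sigma$. Substituting (10) in (24), we 
get 

$$\mathbb{E}[\mathcal{P}(\hat\sigma)] = \kappa_P \left[\exp\left(\frac{L_c R_P}{L_c - L_{B,\tau}}\right) - 1\right] F(P) + I_P, \tag{25}$$ 

where $R_P$ is the target data rate and the data transmit power 
is $P$, 

$$F(P) \triangleq \int_{\theta_P}^{\infty} \frac{1}{x^{2s}} f_{\hat\sigma}(x; P)\, dx \quad \text{and} \quad I_P \triangleq P^l \int_{-\infty}^{\theta_P} f_{\hat\sigma}(x; P)\, dx. \tag{26}$$ 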



The proof is complete by choosing 
1 



(27) 



and showing that $I_P < P$ and that $F(P)$ is bounded for large 
$P$ when $0 < l \le r + 1$ and $n = 1/2$. From (26), 
$I_P = P^l \Pr\{\sigma + w_{A,\tau} < \theta_P\}$ can be bounded as 



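$$I_P \overset{(a)}{\le} \frac{P^l}{r!}\, \mathbb{E}\left(\theta_P - w_{A,\tau}\right)^{2r} \overset{(b)}{=} \frac{P^l}{r!} \sum_{j=0}^{r} \binom{2r}{2j} \theta_P^{2(r-j)}\, \mathbb{E}\, w_{A,\tau}^{2j}$$ 
$$\overset{(c)}{\doteq} P^l \max_{j \in \{0,1,\ldots,r\}} \frac{1}{P^{2(r-j)n + j}} \overset{(d)}{=} \frac{1}{P^{r-l}}, \tag{28}$$ 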

where (a) follows from Lemma 3 above, and the expectation 
is with respect to the distribution of $w_{A,\tau}$, (b) follows from 
the binomial expansion and the fact that $\mathbb{E}\, w_{A,\tau}^{i} = 0$ when $i$ 
is odd, (c) follows from $\theta_P = \frac{1}{P^n}$ and $\mathbb{E}\, w_{A,\tau}^{2j} \doteq \frac{1}{P^j}$, and (d) 
follows by substituting $n = 1/2$ in the left hand side. From 
(28), clearly, $I_P < P$ for large $P$ if $l < r + 1$ and $n = 1/2$. 
When $l = r + 1$ and $n = 1/2$, we have $I_P \doteq P$, and therefore 
we can ensure that $I_P < P$ for large $P$ by scaling $I_P$ by an 
appropriately chosen constant scaling factor. 

Next, we show that $F(P)$ is bounded. Note that 

$$F(P) = \int_{\theta_P}^{1} \frac{1}{x^{2s}} f_{\hat\sigma}(x; P)\, dx + \int_{1}^{\infty} \frac{1}{x^{2s}} f_{\hat\sigma}(x; P)\, dx. \tag{29}$$ 

Now, it is sufficient to show that the first integral in (29) is 
bounded, since the second integral is clearly $\le 1$. To this end, 
we need the distribution of $\hat\sigma$, i.e., $\Pr(\sigma + w_{A,\tau} < x)$, where 
$w_{A,\tau} \sim \mathcal{N}(0, \sigma_{var}^2)$, and $\sigma_{var}^2 = \frac{1}{2 P L_{B,\tau}}$. Consider 


G{x) = Pr (cr + WA.T < x) 



My) 



72o 



2na„ 



(30) 

•^^dzdy, (31) 



where $f_\sigma(y)$ is the pdf of $\sigma$, which is chi distributed with $2r$ 
degrees of freedom. Taking the derivative of (30) with respect 
to $x$, we get 



dGjx) 
dx 



J 



y2r-lg-J^g ^-titdy (32) 



'i.'Ka.var JO 

^irayar Jo 



dy, (33) 



where J is the constant term in the standard chi pdf, (3i = 
, ^2 = = ^ and /33 ^ /32xV(2a2„J. Let 

t and using the binomial expansion, it can be shown 



l+cr2 ' ~ 1+(t2 

that 



dGjx) 
dx 



Jexp(-/33) ^2r - 1 

var j—Q 
oo 



271 a 



(v^) 



2r-j 



— ^ / t^-^-^-^e-Vdi. (34) 



Now, using $\exp(-\beta_3) \le 1$, we can upper bound the first 
term in (29) as 



1 dGjx) 



dx < 



2'K(Ty 



2r-l 

■E 

J=0 



2r- 1 



7?^?^ ^'^'^dx, (35) 

1 + ^vary J Bp 



C. Proof of Theorem 1 

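Using the power control in (10), the outage probability in 
(8) can be written as 

$$P_{out} = \Pr_{\{\hat\sigma < \theta_P\}}\left\{\alpha \log(1 + P^l \sigma^2) < R_P\right\} \tag{37}$$ 
$$\qquad + \Pr_{\{\hat\sigma \ge \theta_P\}}\left\{\alpha \log\left(1 + \kappa_P \Phi(\hat\sigma^{2s})\, \sigma^2\right) < R_P\right\} \tag{38}$$ 
$$\triangleq \Pi_1 + \Pi_2, \tag{39}$$ 

where $\Pi_1$ and $\Pi_2$ denote the terms in (37) and (38), respectively. 
In the above, we have used $\Pr_{\{A\}}\{\cdot\}$ to mean $\Pr\{\cdot \cap \{A\}\}$. 
Using $R_P = g_m \log P$, we have $\Pi_1 \doteq \Pr\left\{\sigma^2 < \frac{1}{P^{(l - g_m/\alpha)}}\right\}$ 
for large $P$ and $0 < l \le r + 1$ from Lemma 1. From Lemma 3 in 
Appendix A, we have $\Pi_1 \preceq \frac{1}{P^{r(l - g_m/\alpha)}}$. Next, substituting 
for $\Phi(\hat\sigma^{2s})$ from (9), $\Pi_2$ can be written as 
$\Pi_2 = \Pr\{\sigma^2 < \hat\sigma^{2s}/\kappa_P\}$. Using 
$\hat\sigma^2 \le \sigma_u^2 = (\sigma + |w_{A,\tau}|)^2$ from Lemma 4 in Appendix A, we get 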



(40) 



n2 < PtU' < —{a+\wA,r\) 



\2s 



< Prl^a'<^^f]a^>\wA,r\' 

+ pr{.^<i^^^n-^^i*-i^}-(4i) 

It is straightforward to show that, provided $\kappa_P$ is strictly 
increasing with $P$, the first term in the above goes to zero 
exponentially with $P$ for $1 \le s \le r$. This implies that $g_m < \alpha$, 
since $\kappa_P \doteq P^{(1 - g_m/\alpha)}$ from Lemma 1. The second term in (41) 
is upper-bounded as 



Pr^a^<^ 



(fc) 



1 



(43) 



where s < r, and Cj = 1 is some constant that does not Pr /a^ < l^'-^-^l'^'^"'" ] -< 



scale with P. Now, the behavior of the terms above with P is 
governed by 



where (a) follows from Lemma [3] in Appendix |A] 
and the = in (b) uses the fact that Kp = pi^~~) 
and E|iZ;a,tP*^ = 1/P'*''. Hence, we have 

which implies 

in 



n2 d — 

we have 



P2 



r:^~^'dx 



1 



j -2s + l 



1 

pai 



1 

pa2 



(36) 



-< max 



Using this and XIi -< — 

- pH'- 

1 



pr 



where $a_1 = r - j/2 - 1/2$, and $a_2 = (-2s + j + 1)n + 
r - j/2 - 1/2$. The exponent corresponding to the first term 
above is $r - j/2 - 1/2 \ge 0$ for all $0 \le j \le 2r - 1$. Also, 
when $n = 1/2$, the exponent corresponding to the second term 
above is $r - s \ge 0$ for all $0 \le j \le 2r - 1$, and hence the 
integral is bounded for $1 \le s \le r$. 

Finally, let Rp = log(P). Since Ip < P and P(P) are 
bounded when < ^ < r + 1, using ^cxp (^77%xf" 

A La- 



jjr (min{ i , s+ 1 } — ) 



(44) 



(45) 



P in ( |27] i, we get up = _ ^ 
completes the proof of Lemma [T] 



where a 



This 



for $0 < l \le r + 1$, $1 \le s \le r$ and $0 \le g_m < \alpha$. This ends the 
proof of Theorem 1. ■ 

D. Proof of Lemma 2 

Note that Pc can be written as 

Pc = Pc - ys.ra - E{Pc - V B ,T2 lys.ra} (46) 
= /or [EjwB^^JyB^^J - Wp,^J . (47) 
Now, 



EllPclir 



(a) 
< 



< 



1 



{A} 



(48) 



{22-|lE{ws,.JyB,.JIir} 

+ 22^Ew,,., {||wB,,J|r}] (49) 
2^^+! „o, . 1 



-E||WB, 



1-2 I! 2 



^,T2 



(50) 



where $A \triangleq \|\mathbb{E}\{\mathbf{w}_{B,\tau_2} | \mathbf{y}_{B,\tau_2}\} - \mathbf{w}_{B,\tau_2}\|_2^{2s}$. In the above, (a) 
follows from the triangle inequality and using $(a + b)^n \le 
(2a)^n + (2b)^n$ for even $n > 0$, and (b) follows from 
Jensen's inequality and the fact that $\mathbb{E}\|\mathbf{w}_{B,\tau_2}\|_2^{2s} < \infty$ as 
$P \to \infty$. The subscripts on the expectation in the above denote 
the random variables over which the expectation is taken; and 
$\mathbb{E}\{X | y\}$ denotes the expectation of the random variable $X$ 
conditioned on the instantiation $Y = y$. This completes the 
proof. ■ 
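
The elementary inequality $(a + b)^n \le (2a)^n + (2b)^n$ used in step (a) holds for nonnegative $a$ and $b$ because $a + b \le 2\max\{a, b\}$. A small numerical check can be sketched as follows; the sampling range, the tolerance and the function name are arbitrary choices for illustration.

```python
import random

def check_power_inequality(n, trials=10_000, seed=0):
    """Check (a + b)^n <= (2a)^n + (2b)^n on random nonnegative pairs (a, b)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
        lhs = (a + b) ** n
        rhs = (2.0 * a) ** n + (2.0 * b) ** n
        # Small relative slack guards against floating-point rounding.
        if lhs > rhs * (1.0 + 1e-12):
            return False
    return True

if __name__ == "__main__":
    for n in (2, 4, 6):
        print(n, check_power_inequality(n))
```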

E. Proof of Theorem 2 

Using the capacity lower bound in (11), the outage probability 
can be upper bounded as 

$$P_{out} \le \Pr\{C_{AB} < R_P\}, \tag{51}$$ 

where $R_P = g_m \log P$ is the target data rate. We choose $\eta < 1$, 
and arbitrarily close to 1. We split the event in the expression 
for $P_{out}$ as 

Pout < Pr I Cab < Rp n E[||p,||2|yB.r2] < (52) 



Pr<^C^B < i?p nE[||pe||^|yB,.,] > 



pn 



(a) 

< Plia l02 1 



P\\P. 



'c\\2 



p(i-v) 



< Rp 



Pr 



|Pc|i2lys,r2] > p;; 



(53) 



where (a) follows by substituting $1/P^{\eta}$ in place 
of $\mathbb{E}[\|\tilde{\mathbf{p}}_c\|_2^2 | \mathbf{y}_{B,\tau_2}]$ in the first term, and removing 
one of the events in the intersection. Define 

A (cxp{flp/a}-l) I^P^^^y that 



Rp — 
written as: 



Then, the first term in 



can be 



Pr {||p,||2 < Rp} < Pr ||||p,j|2 - llPclbl < (54) 



< Pr {^1 fl } + Pr {Ei f| E^} 

< Prjllpclla > \f^t 



+ Pr{||p,||2 <4i?p}, (55) 

where E^ ^ {||p,||2 < ||pc||2 + /rJ} and E2 = {||pc!|2 > 
y^Rp}. In the above, (a) follows from the reverse triangle 
inequality, and the last two inequalities follow by ignoring 
one of the events in the intersection. The first term in (55) can 
be written as 



Pr{llPcllf >i?M < 



s^i-) E||p,||r W 1 



R'p 



1 



pS p{^-n)s'- 



(56) 



where (a) follows from the Markov inequality and (b) follows 



from Lemma 12] Letting 5 
have 



Pr 



lPc|l2 > 



1 



1 



> 0, we 



"-y 



1 < s < r. 



(57) 



pr(s+l- 

In order to solve for the second term in (55), we need to handle 
two cases of the singular value estimate at node A separately: 
the good estimated channel case $g = \{\hat\sigma \ge \theta_P\}$ and the bad 
estimated channel case $b = \{\hat\sigma < \theta_P\}$. 

1) Good Estimated Channel {a > Op}: When a > Op, 
substituting for Pc = ^/V{B)h and K.p ^ P^ ^ , and defining 
<^u — (ct + |?DA,r|) as the upper bound on & from Lemma |4] 
in Appendix lAl the second term in ( |55T l leads to: 



Pr {Es} 



(a) 



'< Pr{a2 <22(-'+i)a2^i?pf|i?4} 

Fr[a' <2'^'+'^\wAA^'Rpf]El} 



< Pr|a2(--i)> 
+ Pr|cr 



Rp 



<22(«+i)|iDAr|''i?p}, (58) 



where E3 = {4^ < 4i?p}, and Et ^ {a^ > \waA''}- 

In the above, we have used Pr{^j.{-} to mean Pr{-P|{A}}, 
as before; and (a) follows by ignoring the event g. It can 
be shown that first term in (|58t decreases exponentially with 
P =-1 , as follows: 



Pr{6} 



(a) poo 



-X ^r— 1 



l/pl/(=-l) 



dx 



(59) 



cxp 



{-i/R^-''} 



r-l 



^ ('nl/(''-l)\r-fc-l 
k=0 I 



(b) 



(60) 



2(s-l) 



> 



-}, and Z = P 



In 



where B ^ {a ■ ^ W^tttr^ 
the above, (a) follows by ignoring the constant factors and 
substituting for the chi-square pdf of cr^. Since l/Rp = 
7]a, and since the exponential term 
outside the summation dominates the polynomial terms inside 
the summation, we obtain (b). Note that the special case of 
s = 1 is trivial, since this corresponds to the probability that 
Rp exceeds a constant, which becomes zero for sufficiently large 
P. The second term in (58) becomes: 



Pv{a' <2'^-^+'^\wA,r\''Rp} 



< 



22r(.+l)prE||^^^^|2sr| 



1 



p{iri-^)r prs 
1 



(61) 
(62) 
for $0 \le g_m < \eta\alpha$. In the above, we have used Lemma 3 in 
Appendix A and $\mathbb{E}|w_{A,\tau}|^{2sr} \doteq \frac{1}{P^{sr}}$. Thus, in the good estimated 
channel case, we have 

^ Pr ^ {IIpcII^ < 4i?p} ^ _ . ^ . , < 5,n < 

(63) 

2) Bad Estimated Channel $\{\hat\sigma < \theta_P\}$: Recall that when 
$\hat\sigma < \theta_P$, the composite channel is given by $\mathbf{p}_c = \sqrt{P^l}\,\mathbf{h}$. With 
this, the second term in (55) becomes 

^PrjMl<iRp} = Pr|||h||^<l|^n^ 



< Pr <! < 
1 



-pT 



1 



prl pr(^-am.+ri) 
1 



(64) 
(65) 
(66) 



where $0 < l \le r$. This completes the analysis of the first term 
in (53). 

Now, the second term in (53) can be bounded as: 



Pr (E[|ip, 



l2|yB,Tj 



> 



1 



< E(E[||p,||^|yB.r. 



< 

{c) 
-< 



1 



(67) 
(68) 



PC(1-'?) ' 

where $\zeta > 0$ is an arbitrary number. In the above, (a) and (b) 
follow from the Markov inequality and Jensen's inequality, 
respectively, and (c) follows from Lemma 2. Since $\eta < 1$, and 
$\zeta$ can be chosen arbitrarily large, the second term in (53) goes 
to zero with an arbitrarily large exponent as $P$ goes to infinity. 

Putting (57), (63) and (66) together, a DMT of 

$d(g_m) = r\left(\min\{l, s\} + \eta - \frac{g_m}{\alpha}\right)$ is achievable, for $0 < l \le r$, 
$1 \le s \le r$ and $0 \le g_m < \eta\alpha$. Noting that $\eta$ is arbitrarily close 
to 1 completes the proof of Theorem 2. ■ 